Object Detection
1. Overview
YOLOv11 is the latest-generation object detection model released by Ultralytics, delivering an exceptional balance between speed and accuracy. When deployed locally on NVIDIA Jetson devices (such as Orin Nano, NX, or AGX), YOLOv11 enables efficient, low-latency AI inference optimized for edge environments.
This guide covers:
- Preparing the environment and installing JetPack
- Running YOLOv11 quickly using Docker
- Installing YOLOv11 and its dependencies locally
- Accelerating inference with TensorRT
- Leveraging DLA acceleration and benchmarking performance
YOLOv11 runs exceptionally well on devices such as the Jetson Orin Nano, offering a powerful solution for real-time edge AI deployment.
2. Environment Preparation
Hardware Support
| Device | Supported JetPack Version | AI Performance |
| --- | --- | --- |
| Jetson Nano | JetPack 4.6.x | 472 GFLOPS |
| Jetson Xavier NX | JetPack 5.1.x | 21 TOPS |
| Jetson Orin NX 16GB | JetPack 6.x | 100 TOPS |
| Jetson Orin Nano Super | JetPack 6.x | 67 TOPS |
For optimal results, it is recommended to use JetPack 5.1 or later and enable maximum performance mode:
sudo nvpmodel -m 0
sudo jetson_clocks
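To confirm which L4T/JetPack release the board is actually running (and therefore which wheels and Docker tags to use later), a quick check can help. The snippet below is a minimal sketch assuming a standard JetPack installation; it reads the /etc/nv_tegra_release file and queries the nvidia-jetpack apt package, so adapt it if your image differs.

# Minimal sketch: report the L4T release and installed JetPack package version.
# Assumes the standard JetPack layout (/etc/nv_tegra_release, nvidia-jetpack apt package).
import subprocess
from pathlib import Path

l4t_release = Path("/etc/nv_tegra_release")
if l4t_release.exists():
    # First line contains the L4T release, e.g. "# R36 (release), REVISION: 3.0, ..."
    print(l4t_release.read_text().splitlines()[0])

# Query apt for the JetPack meta-package version (prints nothing if it is not installed)
result = subprocess.run(["apt-cache", "show", "nvidia-jetpack"], capture_output=True, text=True)
for line in result.stdout.splitlines():
    if line.startswith("Version:"):
        print("nvidia-jetpack", line)
        break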
3. Quick Start via Docker (Recommended)
The fastest way to get started is by using the prebuilt Docker image provided by Ultralytics. Run the following commands:
sudo docker pull ultralytics/ultralytics:latest-jetson-jetpack6
sudo docker run -it --ipc=host --runtime=nvidia ultralytics/ultralytics:latest-jetson-jetpack6
This image includes YOLOv11, PyTorch, Torchvision, TensorRT, and all other required dependencies.
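Once inside the container, you can confirm everything is wired up by running a quick test inference. The snippet below is a minimal sketch using the standard Ultralytics Python API; it downloads the pretrained yolo11n.pt weights and a sample image on first run.

# Quick sanity check inside the container: run one inference with a pretrained model.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")  # downloads the nano weights on first use
results = model("https://ultralytics.com/images/bus.jpg")  # sample image hosted by Ultralytics
print(results[0].boxes)  # detected bounding boxes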
4. Installing YOLOv11 Locally (Optional)
This option is intended for users who need a customized environment.
Step 1: Set Up the Python Environment
sudo apt update
sudo apt install python3-pip -y
pip install -U pip
Step 2: Install the YOLOv11 Package
pip install ultralytics[export]
Step 3: Install Compatible PyTorch and Torchvision
While the ultralytics package automatically installs PyTorch and Torchvision, the pip-installed versions are not compatible with ARM64-based Jetson platforms. Instead, you need to manually install PyTorch and Torchvision wheels built specifically for Jetson.
Example: JetPack 6.1 + Python 3.10
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torch-2.5.0a0+872d972e41.nv24.08-cp310-cp310-linux_aarch64.whl
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/torchvision-0.20.0a0+afc54f7-cp310-cp310-linux_aarch64.whl
Install cuSPARSELt to Resolve Torch 2.5.0 Dependency:
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/arm64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
sudo apt-get -y install libcusparselt0 libcusparselt-dev
Verify PyTorch Version and GPU Availability:
python3 -c "import torch; print(torch.__version__)" # 2.5.0a0+872d972e41.nv24.08
python3 -c "import torch; print(torch.cuda.is_available())" # True
Step 4: Install ONNX Runtime (GPU)
All available onnxruntime-gpu packages, organized by JetPack version, Python version, and other compatibility details, are listed in the Jetson Zoo ONNX Runtime compatibility matrix. In this example, we will download and install onnxruntime-gpu 1.20.0 with support for Python 3.10.
pip install https://github.com/ultralytics/assets/releases/download/v0.0.0/onnxruntime_gpu-1.20.0-cp310-cp310-linux_aarch64.whl
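After installation, you can confirm that the GPU-enabled build is in use by listing the available execution providers (a quick optional check; the exact provider names depend on how the wheel was built):

# Confirm the GPU-enabled ONNX Runtime wheel is installed and which providers it exposes.
import onnxruntime as ort

print("onnxruntime:", ort.__version__)
# Jetson GPU builds typically expose CUDAExecutionProvider (and sometimes TensorrtExecutionProvider)
print("Providers:", ort.get_available_providers())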
5. Accelerating YOLOv11 Inference with TensorRT
Ultralytics supports exporting YOLOv11 models to TensorRT engine files (.engine) to significantly improve inference performance.
Python Example
from ultralytics import YOLO
model = YOLO("yolo11n.pt")
model.export(format="engine")  # creates 'yolo11n.engine'
trt_model = YOLO("yolo11n.engine")
results = trt_model("https://ultralytics.com/images/bus.jpg")
CLI Example
# Export a YOLO11n PyTorch model to TensorRT format
yolo export model=yolo11n.pt format=engine # creates 'yolo11n.engine'
# Run inference with the exported model
yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg'
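The export call also accepts precision-related arguments, which is how the FP16 and INT8 engines compared in the benchmark section below are produced. The following is a hedged sketch using the standard Ultralytics export arguments (half, int8, data, imgsz); check them against your installed version:

# Sketch: export the same model with different TensorRT precisions.
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# FP16 engine: large speedup with minimal accuracy loss
model.export(format="engine", half=True, imgsz=640)

# INT8 engine: fastest, but requires a calibration dataset and costs accuracy.
# Note: both exports write 'yolo11n.engine', so rename or move the first file before running the second.
model.export(format="engine", int8=True, data="coco8.yaml", imgsz=640)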
6. Using DLA (Deep Learning Accelerator)
Some Jetson devices feature built-in DLA (Deep Learning Accelerator) cores that enable lower power consumption and improved parallel inference.
Python Example
model.export(format="engine", device="dla:0", half=True)
CLI Example
# Export a YOLO11n PyTorch model to TensorRT format with DLA enabled (only works with FP16 or INT8)
# Once DLA core number is specified at export, it will use the same core at inference
yolo export model=yolo11n.pt format=engine device="dla:0" half=True # dla:0 or dla:1 corresponds to the DLA cores
# Run inference with the exported model on the DLA
yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg'
Note: Some model layers may not run entirely on the DLA and will fall back to GPU execution if unsupported.
7. Object Detection Example
import cv2
import time
from ultralytics import YOLO
# Load the TensorRT engine model (exported from YOLO11)
model = YOLO("yolo11n.engine") # Replace with the path to your .engine model
# Open the USB camera (usually /dev/video0)
cap = cv2.VideoCapture(0)
# Set camera resolution (match model input size for best performance)
cap.set(cv2.CAP_PROP_FRAME_WIDTH, 640)
cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 480)
# Instantaneous FPS of the most recent frame
fps = 0.0
# Check if the camera opened successfully
if not cap.isOpened():
    print("❌ Cannot open camera")
    exit()
print("📸 Real-time detection started. Press 'q' to quit.")
while True:
    # Read a frame from the camera
    ret, frame = cap.read()
    if not ret:
        break

    # Start timing for this frame
    t0 = time.time()

    # Run inference
    results = model(frame)

    # Plot the results (draw bounding boxes, labels, etc.)
    annotated = results[0].plot()

    # Calculate instantaneous FPS from inference + plotting time
    t1 = time.time()
    fps = 1.0 / (t1 - t0)

    # Draw FPS on the frame
    cv2.putText(annotated, f"FPS: {fps:.2f}", (10, 30),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    # Show the annotated frame
    cv2.imshow("YOLO11 - TensorRT Real-time Detection", annotated)

    # Exit on 'q' key press
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
# Release camera and close display window
cap.release()
cv2.destroyAllWindows()
8. Benchmark Performance Comparison

| Model Format | Orin Nano (ms) | mAP50-95 | Orin NX (ms) |
| --- | --- | --- | --- |
| PyTorch | 21.3 | 0.6176 | 19.5 |
| TorchScript | 13.4 | 0.6100 | 13.03 |
| TensorRT (FP16) | 4.91 | 0.6096 | 4.85 |
| TensorRT (INT8) | 3.91 | 0.3180 | 4.37 |
✅ TensorRT (FP16) achieves the best balance between speed and accuracy. ⚠️ INT8 offers the fastest inference speed but with a significant accuracy drop.
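Numbers like these can be reproduced with the Ultralytics benchmark utility, which exports the model to each format and measures speed and mAP. A minimal sketch follows; the dataset and image size are illustrative, and your results will depend on JetPack version and power mode.

# Sketch: benchmark YOLO11n across export formats on this device.
# "coco8.yaml" is a tiny sample dataset used only for illustration; use a full
# validation set if you want mAP numbers comparable to the table above.
from ultralytics.utils.benchmarks import benchmark

benchmark(model="yolo11n.pt", data="coco8.yaml", imgsz=640, half=True, device=0)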
9. Optimization Tips

| Optimization | Recommended Command |
| --- | --- |
| Power Mode | sudo nvpmodel -m 0 |
| CPU/GPU Frequency | sudo jetson_clocks |
| System Monitoring | sudo pip install jetson-stats, then run jtop |
| Memory Management | Add swap space if memory is tight and clear caches when needed |
10. Troubleshooting

| Issue | Solution |
| --- | --- |
| Cannot import PyTorch after installation | Make sure you are using the .whl package built specifically for Jetson |
| TensorRT inference is slower than expected | Check that jetson_clocks is enabled and use FP16 mode |
| Failed to pull Docker image | Ensure Docker is properly installed and run with --runtime=nvidia |
| No module named 'tensorrt' in a virtual environment | Copy the package from the system Python into the venv: cp -r /usr/lib/python3.10/dist-packages/tensorrt your_venv/lib/python3.10/site-packages/, then verify with the quick check below |
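After copying the TensorRT bindings into the virtual environment, a quick import check (an optional sketch, run inside the venv) confirms they are picked up:

# Confirm the TensorRT Python bindings copied from the system Python are importable in the venv.
import tensorrt as trt

print("TensorRT:", trt.__version__)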